Fitting in a complex $\chi^2$ landscape using an optimized hypersurface sampling

Fitting a data set with a parametrized model can be seen geometrically as finding the global
minimum of the $\chi^2$ hypersurface, which depends on a set of parameters $\{P_i\}$. This is usually
done using the Levenberg-Marquardt algorithm. The main drawback of this algorithm is that, despite
its fast convergence, it can get stuck if the parameters are not initialized close to the final
solution. We propose a modification of the Metropolis algorithm that introduces a parameter step
tuning which optimizes the sampling of parameter space. The ability of the parameter tuning
algorithm, together with simulated annealing, to find the global $\chi^2$ hypersurface minimum,
jumping across $\chi^2\{P_i\}$ barriers when necessary, is demonstrated with synthetic functions
and with real data.

INTRODUCTION

Fitting a parametrized model to experimental results is the most usual way to
obtain the physics hidden behind data. However, as nicely reported by Transtrum
et al., this can be quite challenging, and it usually takes "weeks of human
guidance to find a good starting point". Geometrically, the problem of finding a
best fit corresponds to finding the global minimum of the $\chi^2$ hypersurface.
As this hypersurface is often full of fissures, local minima prohibit an
efficient search. The human guidance usually consists of a set of tricks
(depending on every particular problem) that allow one to choose the starting
point in this landscape such that the first minimum found is indeed the global
minimum.

This problem is usually due to the mechanism behind classical fit algorithms
such as Levenberg-Marquardt (LM) [?]: a set of parameters $\{P_i\}$ is optimized
by varying the parameters and accepting the modified parameter set as a starting
point for the next iteration only if this new set reduces the value of a cost or
merit function such as $\chi^2$. From a geometrical point of view, those
algorithms allow only downhill movements in the $\chi^2\{P_i\}$ hypersurface.
Therefore they can get stuck in local minima or get lost in flat regions of the
landscape. This means that they are only able to find an optimal solution if
they are initialized around the absolute minimum of the hypersurface.

The challenge of finding the global minimum can alternatively be tackled by
Bayesian methods [?], [?], as demonstrated in different fields such as astronomy
or biology [?], solid state physics [?], quasielastic neutron scattering data
analysis [?], and Reverse Monte Carlo methods [?]. We follow a Bayesian approach
to the fit problem in this contribution. This method is based on another
mechanism to wander around in parameter space: instead of allowing only downhill
movements, parameter changes that increase $\chi^2$ can also be accepted if the
change in $\chi^2$ is compatible with the data errors.

To do that, a Markov Chain Monte Carlo (MCMC) method is used, where the Markov
Chains are generated by the Metropolis algorithm [?]. However, while in the case
of the LM algorithm the initialization of parameters is critical to the
convergence of the algorithm, it is here the tuning of the maximum parameter
change allowed at each step (called parameter jumps hereafter) that will decide
the success of the algorithm in finding the global $\chi^2\{P_i\}$ minimum in an
efficient way.

If the parameter jumps are chosen too small, the algorithm will accept nearly
every parameter change, getting lost in irrelevant details of the
$\chi^2\{P_i\}$ landscape. If they are chosen too large, the proposed parameters
will hardly ever be accepted and the algorithm will get stuck every now and
then. Moreover, in the case of models defined by more than one parameter, when
parameter jumps are not properly chosen, the parameter space can be
over-explored in the direction of those parameters with too small jump lengths;
in other words, the model would be insensitive to the proposed change of these
parameters. On the other hand, some other parameters can be associated with a
jump so big that changes are hardly ever accepted.
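The dependence of the acceptance ratio on the jump size can be illustrated with a minimal sketch. It assumes a hypothetical one-parameter problem whose $\chi^2$ is a parabola of width 0.1 centred at zero; the function names, the number of steps and the three jump sizes are illustrative choices and are not taken from this work.

```python
import math
import random

def chi2(p, sigma=0.1):
    """Hypothetical one-parameter chi-square: a parabola centred at 0 (assumption)."""
    return (p / sigma) ** 2

def acceptance_ratio(jump, n_steps=5000, seed=0):
    """Run a plain Metropolis chain and return the fraction of accepted proposals."""
    rng = random.Random(seed)
    p, accepted = 0.0, 0
    for _ in range(n_steps):
        trial = p + rng.uniform(-1.0, 1.0) * jump
        delta = chi2(trial) - chi2(p)
        # Downhill moves are always accepted, uphill moves with probability exp(-delta/2).
        if delta <= 0.0 or rng.random() < math.exp(-0.5 * delta):
            p, accepted = trial, accepted + 1
    return accepted / n_steps

for jump in (0.001, 0.1, 100.0):
    print(f"jump = {jump:g}: acceptance ratio = {acceptance_ratio(jump):.3f}")
```

With the smallest jump almost every proposal is accepted but the chain barely moves, while with the largest jump almost every proposal is rejected.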

Different schemes have been proposed in order to change parameter jumps to
explore the target distribution efficiently using Markov Chains, under the
generic name of adaptive MCMC [13]. Using the framework of Stochastic
Approximation [10], we present in this work an algorithm belonging to the group
of "Controlled Markov Chains" [11], [12], where the calculation of new parameter
jumps takes the history of the Markov Chain and previous parameter jumps into
account.

Two main approaches are known which take the Markov Chain history into account:
Adaptive Metropolis (AM) algorithms (implemented for example in PyMC [?]) and
algorithms that use rules following the Robbins-Monro update [14], [15], [16].
In the first case, parameter jumps are tuned using the covariance matrix at
every step, so that once the adaptation is finished the algorithm should be
wandering with a parameter jump close to the "error" of the parameter (defined
as the variance of the posterior parameter PDF). In some cases, this kind of
algorithm [17] can get stuck if the acceptance ratio of a parameter is too high
or too low. In this case the Markov Chain stops learning from the past history,
and the optimization is stopped with suboptimal parameter jumps. This problem is
overcome by Robbins-Monro update rules, which change parameter jumps so that
they are accepted with an optimal ratio.

The main danger of optimized Metropolis algorithms is that adaptation might
cause the Markov Chain to no longer converge to the target distribution. In
other words, the Markov Chain might lose its ergodicity. For example, in the
case of AM algorithms, the generated chain is not Markovian since it depends on
the history of the chain. However, as demonstrated by Haario et al. [13], the
chain is able to reproduce the target distribution, i.e. it is ergodic. In the
second type of algorithms, the Robbins-Monro type, ergodicity must be assured by
updating only at regeneration times [15]. In any case, as pointed out by Andrieu
et al. [13], the convergence to the target distribution is assured if the
optimization vanishes. In other words, if parameter jumps oscillate around a
fixed value, the ergodic property of the Markov Chain is assured.

The presented algorithm is based on the stochastic approach of Robbins-Monro
with an updating rule inspired by the one of Gilks et al. [15]. Optimization of
parameter jumps is therefore performed with two goals in mind:

• To calculate them in such a way that all parameters are accepted with the
same ratio. Adjusting parameter jumps so that all parameter changes have the
same acceptance ratio is important to explore the $\chi^2\{P_i\}$ landscape with
the same efficiency in all parameter directions.

• To adjust parameter jumps to a value tailored to the stage of the fit. This
will turn out to be important when exploring the $\chi^2\{P_i\}$ hypersurface
using the simulated annealing technique [18], since this allows the parameter
jumps to be optimized to explore $\chi^2\{P_i\}$ (see subsection "Fitting in a
complex $\chi^2$ landscape"): at the beginning of the fit process the algorithm
will set parameter jumps to a large value to explore large portions of the
$\chi^2$ landscape, and at the final stages these parameter jumps will be set to
small values by the same algorithm in order to find its absolute minimum.

Geometrically, we can interpret the algorithm as setting the parameter step
sizes to a value related to the $\chi^2$ hypersurface landscape. First, it
modifies the parameter jump to take into account the shape of the hypersurface
along a parameter direction. If $\chi^2\{P_k\}$ (the cut along a parameter $k$)
is flat (the parameter direction is "sloppy", following Sethna's nomenclature
[?]), the parameter step size is set to a larger value, and parameters will move
faster in this sloppy direction. On the contrary, in the directions where
$\chi^2\{P_k\}$ has a larger slope (the "stiff" direction, following Sethna's
nomenclature), parameter steps will be set to a smaller value so that they are
accepted with the same ratio as the previous ones. Second, it modifies the
parameter jumps to take the shape of the global $\chi^2$ landscape into account
when simulated annealing is used. At the beginning of the fit, parameter jumps
will be set to a large value so that details of $\chi^2\{P_k\}$, i.e. local
minima, will be smeared out, making it easier to find the global minimum.
However, during the last steps of the fitting process, parameter steps will be
set to a small value by the algorithm so that the system will be allowed to
relax inside the minimum.

The present work gives a detailed description of how the algorithm works, and is
organized as follows. We first briefly recall the Metropolis method applied to
generate Markov Chains. In the next section, the proposed algorithm to optimize
the parameter step size is introduced. Afterwards, we check its robustness in
finding optimized parameter jumps using a simple test function. Finally, we test
the ability of the regenerative algorithm, combined with the simulated annealing
technique, to find the global minimum of $\chi^2$, even with poor initialization
values, using a simple function with a complex $\chi^2\{P_i\}$ landscape. The
algorithm presented in this work has been implemented in the program FABADA
[20].



THE FIT METHOD 
Fitting with the Bayesian ansatz 

Fitting data using the Metropolis algorithm is based on an iterative process
where successively proposed parameter sets are accepted according to the
probability that these parameters describe the actual data, given all available
evidence. Hence this method makes use of our knowledge of the error bars of the
data.

We now briefly recall how this can be done using a Metropolis algorithm, to
proceed in the next section with the algorithm to adjust parameter jumps.

We should first start with the probabilistic bases behind the $\chi^2$
definition. The probability $P(H \mid D)$ that a hypothesis $H$ correctly
describes an experimental result $D$ is related to the likelihood
$P(D \mid H)$ that the experimental data $D_k$ ($k = 1, \ldots, n$) are
correctly described by a model or hypothesis $H_k$ ($k = 1, \ldots, n$), using
Bayes' theorem

$$ P(H_k \mid D_k) = \frac{P(D_k \mid H_k)\, P(H_k)}{P(D_k)} \tag{1} $$

where $P(H_k \mid D_k)$ is called the posterior, the probability that the
hypothesis is in fact describing the data. $P(D_k \mid H_k)$ is the likelihood,
the probability that the description of the data by the hypothesis is good.
$P(H_k)$ is called the prior, the probability density function (PDF)
summarizing the knowledge we have about the hypothesis before looking at the
data. $P(D_k)$ is a normalization factor to assure that the integrated posterior
probability is unity.

In the following we will assume no prior knowledge (maximum ignorance prior
[?]). In this special case Bayes' theorem takes the simple form

$$ P(H_k \mid D_k) \propto P(D_k \mid H_k) \equiv L \tag{2} $$

where $L$ is a short notation for likelihood.
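As a small numerical illustration of equations (1) and (2), the sketch below applies Bayes' theorem to two competing hypotheses. The likelihood and prior values are made up for the example, and `posterior` is a hypothetical helper name.

```python
def posterior(likelihoods, priors):
    """Apply Bayes' theorem to a discrete set of competing hypotheses."""
    # The evidence P(D) normalises the posterior so that it sums to one.
    evidence = sum(l * p for l, p in zip(likelihoods, priors))
    return [l * p / evidence for l, p in zip(likelihoods, priors)]

likelihoods = [0.20, 0.05]          # P(D | H) for two hypothetical hypotheses
flat = posterior(likelihoods, [0.5, 0.5])
informed = posterior(likelihoods, [0.1, 0.9])
print("flat prior:       ", flat)
print("informative prior:", informed)
```

With the flat prior the posterior is simply the normalised likelihood, which is the situation assumed in the rest of this work.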

Although this is by no means a prerequisite, we will assume in the following
that the likelihood that every single data point $D_k$ is described by the model
or hypothesis $H_k$ follows a Gaussian distribution. The case of a Poisson
distribution was discussed previously [?]. For data with a Gaussian distributed
uncertainty with width $\sigma_k$, the likelihood for each individual data point
takes the form

$$ P(D_k \mid H_k) \propto \exp\left[-\frac{1}{2}\left(\frac{H_k - D_k}{\sigma_k}\right)^2\right] \tag{3} $$

and correspondingly, the likelihood that the whole data set is described by this
hypothesis is

$$ P(D \mid H) \propto \prod_{k=1}^{n} \exp\left[-\frac{1}{2}\left(\frac{H_k - D_k}{\sigma_k}\right)^2\right] = \exp\left(-\frac{\chi^2}{2}\right), \qquad \chi^2 = \sum_{k=1}^{n}\left(\frac{H_k - D_k}{\sigma_k}\right)^2 \tag{4} $$

The Metropolis algorithm will in this special case consist of the proposition of
successive sets of parameters $\{P_i\}$. A new set of parameters is generated
changing one parameter at a time using the rule

$$ P_i^{\mathrm{new}} = P_i^{\mathrm{old}} + r\, \Delta P_i^{\mathrm{max}} \tag{5} $$

where $\Delta P_i^{\mathrm{max}}$ is the maximum change allowed to the
parameter, or parameter jump, and $r$ is a random number between -1.0 and 1.0.
The new set of parameters will always be accepted if it lowers the value of
$\chi^2$, or, if the opposite happens, it will be accepted with a probability

$$ \frac{P(H(P_{i+1}) \mid D_k)}{P(H(P_i) \mid D_k)} = \exp\left(-\frac{\chi^2_{i+1} - \chi^2_i}{2}\right) \tag{6} $$

where $\chi^2_{i+1}$ and $\chi^2_i$ correspond to the $\chi^2$ for the proposed
new set of parameters and the old one, respectively. Otherwise, this new
parameter value will be rejected and the fit function does not change during
this step.
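The proposal and acceptance rules described above can be sketched in a few lines. This is a minimal illustration and not the FABADA implementation: the straight-line model, the data values, the fixed jump of 0.1 and the random choice of which parameter to change are all assumptions made for the example.

```python
import math
import random

def chi_square(params, model, x, data, sigma):
    """Sum of squared residuals, each normalised by the data error."""
    return sum(((model(xk, params) - dk) / sk) ** 2
               for xk, dk, sk in zip(x, data, sigma))

def metropolis_step(params, jumps, chi2_fn, rng):
    """Propose a change of one parameter and accept or reject it.

    Returns the (possibly unchanged) parameter list, the index of the
    parameter that was tried, and whether the proposal was accepted.
    """
    i = rng.randrange(len(params))          # assumption: parameter chosen at random
    trial = list(params)
    trial[i] += rng.uniform(-1.0, 1.0) * jumps[i]
    delta = chi2_fn(trial) - chi2_fn(params)
    # Always accept a lower chi-square, otherwise accept with exp(-delta/2).
    if delta <= 0.0 or rng.random() < math.exp(-0.5 * delta):
        return trial, i, True
    return list(params), i, False

# Usage on a hypothetical straight-line model y = a*x + b with known errors.
def line(xk, p):
    return p[0] * xk + p[1]

x = [0.0, 1.0, 2.0, 3.0, 4.0]
data = [1.1, 2.9, 5.2, 6.8, 9.1]            # roughly y = 2x + 1
sigma = [0.2] * len(x)

def chi2_fn(p):
    return chi_square(p, line, x, data, sigma)

rng = random.Random(1)
params = [0.0, 0.0]
for step in range(20000):
    params, index, accepted = metropolis_step(params, [0.1, 0.1], chi2_fn, rng)
print(params, chi2_fn(params))
```

Recomputing the old $\chi^2$ at every step keeps the sketch short; a production code would cache it.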

The Metropolis algorithm described here is very similar to the one used in
statistical physics to find the possible molecular configurations (microstates)
at a given temperature. In that case the algorithm minimizes the energy of the
system while allowing changes in molecular positions that yield an increase of
the energy if it is compatible with the temperature.

Inspired by the similarities between fitting data using a Bayesian approach and
molecular modeling using Monte Carlo methods, a simulated annealing procedure
proposed by Kirkpatrick [18] might optionally be used (see for example [22],
[23]). Following the idea of that work, the $\chi^2$ landscape might be compared
with an energy landscape used to describe glassy phenomena [24]. What we do is
to start at high temperatures, i.e. in the liquid phase, where details of the
energy landscape are not so important. By lowering the temperature fast enough
the system might fall into a local minimum, i.e. into the glassy phase. In that
case the system is quenched, as is normally done by standard fitting methods.
The presented algorithm aims to avoid being trapped in local minima using an
"annealing schedule" as suggested by Kirkpatrick. This is done by artificially
increasing the errors of the data to be fitted and letting the errors slowly
relax until they reach their true values. Because this is very similar to what
is performed in molecular modeling, the parameter favoring the uphill movements
in equation (7) is usually called temperature, yielding the acceptance rule

$$ \frac{P(H(P_{i+1}) \mid D_k)}{P(H(P_i) \mid D_k)} = \exp\left(-\frac{\chi^2_{i+1} - \chi^2_i}{2\,T}\right) \tag{7} $$



As happens with Monte Carlo simulations, increasing the temperature will
increase the acceptance of parameter sets that increase $\chi^2$, thus making
the jump over $\chi^2$ barriers between minima easier.
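The effect of the temperature in equation (7) can be checked numerically. The function name and the example $\chi^2$ values below are illustrative assumptions.

```python
import math

def acceptance_probability(chi2_old, chi2_new, temperature=1.0):
    """Probability of accepting a proposal under the rule of equation (7)."""
    if chi2_new <= chi2_old:
        return 1.0                      # downhill moves are always accepted
    return math.exp(-(chi2_new - chi2_old) / (2.0 * temperature))

# The same uphill move (chi-square increase of 20) at three temperatures.
for t in (1.0, 10.0, 1000.0):
    print(f"T = {t:g}: p = {acceptance_probability(100.0, 120.0, t):.3g}")
```

A barrier that is practically impassable at $T = 1$ is crossed almost freely at $T = 1000$.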



Adjusting the parameter step size 

The objective of tuning the parameter step size is to choose a proper value for
$\Delta P_i^{\mathrm{max}}$ in equation (5) to optimize the parameter space
exploration.

Given the total number of algorithm steps $N$ and the number of steps that yield
a change in $\chi^2$, i.e. the number of successful attempts, $K$, the ratio $R$
of steps yielding a $\chi^2$ change is $R = K/N$. $R_{\mathrm{desired}}$ is
defined as the ratio with which some parameter should be accepted in a step. As
we want every parameter to be changed with the same ratio,
$R_{i,\mathrm{desired}} = R_{\mathrm{desired}}/m$, where $m$ is the number of
parameters.

The algorithm is initialized with a first guess for the parameter step sizes.
This first guess, as will be seen shortly, is not important due to the fast
convergence of the algorithm to the optimized values. The calculation of a new
$\Delta P_i^{\mathrm{max}}$, i.e. the regeneration of the Markov Chain, is done
after $N$ steps, i.e. at regeneration times, through the equation

$$ \Delta P_i^{\mathrm{max,new}} = \Delta P_i^{\mathrm{max,old}}\, \frac{R_i}{R_{i,\mathrm{desired}}} \tag{8} $$

where $R_i$ is the actual acceptance ratio of parameter $i$. Following the
previous equation, if the calculated ratio $R_i / R_{i,\mathrm{desired}}$ is
equal to one, i.e. if all parameters are changing with the same predefined
ratio, $\Delta P_i^{\mathrm{max}}$ will not be changed.

If during the fit process a change of parameter $P_i$ is accepted too often, the
parameter space is being over-explored with regard to parameter $i$. The
algorithm will then make $\Delta P_i^{\mathrm{max}}$ larger in order to reduce
its acceptance. The contrary happens if the acceptance is too low for a
parameter: the algorithm makes $\Delta P_i^{\mathrm{max}}$ smaller to increase
its acceptance ratio. This will set different step sizes for each parameter,
making the exploration of all of them equally efficient.
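A minimal sketch of this update is given below, assuming the multiplicative rule in which each jump is rescaled by the ratio of its measured to its desired acceptance. The function name `update_jumps` is hypothetical, and the floor of one accepted move is an added safeguard (not taken from this work) that prevents a jump from collapsing to zero when nothing was accepted.

```python
def update_jumps(jumps, accepted, n_steps, r_desired):
    """Rescale every parameter jump at a regeneration time.

    jumps     -- current maximum change allowed for each parameter
    accepted  -- number of accepted changes of each parameter in the last n_steps
    r_desired -- desired total acceptance ratio, shared equally by the parameters
    """
    r_i_desired = r_desired / len(jumps)
    new_jumps = []
    for jump, k in zip(jumps, accepted):
        r_i = max(k, 1) / n_steps       # floor of one accepted move (assumption)
        new_jumps.append(jump * r_i / r_i_desired)
    return new_jumps

# Three parameters, 1000 steps, desired total acceptance of 66% (22% each).
new_jumps = update_jumps([10.0, 10.0, 10.0], [10, 220, 440], 1000, 0.66)
print(new_jumps)
```

In this example the first parameter was accepted too rarely and its jump shrinks, the second is already at the desired ratio and keeps its jump, and the third was accepted too often and its jump grows.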



DEMONSTRATIONS OF FITTING FUNCTIONS 
Fitting in a well-behaved landscape 

The optimization of the parameter step size is shown using a Gaussian function
(equation (9)), where $A$ is the amplitude, $W$ is the width and $C$ is the
center of the Gaussian. A function has been generated with the parameter set
$\{A, W, C\} = \{10, 1, 5\}$, and a normally distributed error with
$\sigma = 0.1$ was added. A series of tests with different initial values for
parameter jumps and different desired acceptance ratios have been carried out
(see below for details). The initial parameters for the fit were
$\{A, W, C\} = \{2, 2, 2\}$. In all cases the algorithm was able to fit the
data, as can be seen in figure 1.

The parameter step size was adjusted every 1000 steps. Three cases are shown in
figure 2: an initial $\Delta P^{\mathrm{max}}$ of 10 (a very large jump compared
to the parameter values, nearly always resulting in a rejection of the new
parameters) and an $R_{\mathrm{desired}}$ of 66%; the same
$\Delta P^{\mathrm{max}}$ with an $R_{\mathrm{desired}}$ of 9%; and finally a
$\Delta P^{\mathrm{max}}$ of $10^{-4}$ (a very small jump compared to the
parameter values, resulting in a slow exploration of the parameter space) and an
$R_{\mathrm{desired}}$ of 9%. It can be seen that the algorithm manages in all
these extreme cases to adapt the jump size quickly and reliably in order to make
$R$ equal to $R_{\mathrm{desired}}$.
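The test can be reproduced in spirit with the following self-contained sketch. Several ingredients are assumptions rather than details taken from this work: the Gaussian is written with $W$ as its standard deviation, the data are sampled on a grid from 0 to 10, the parameter to change is picked at random, and a floor of one accepted move keeps a jump from collapsing to zero.

```python
import math
import random

def gaussian(x, a, w, c):
    """Assumed form: amplitude a, width w taken as the standard deviation, centre c."""
    if w == 0.0:
        return 0.0                      # degenerate width: treat the peak as absent
    z = (x - c) / w
    return a * math.exp(-0.5 * z * z)

rng = random.Random(42)
SIGMA = 0.1
xs = [0.2 * k for k in range(51)]       # x grid from 0 to 10 (assumption)
ys = [gaussian(x, 10.0, 1.0, 5.0) + rng.gauss(0.0, SIGMA) for x in xs]

def chi2(p):
    return sum(((gaussian(x, *p) - y) / SIGMA) ** 2 for x, y in zip(xs, ys))

def run(params, jumps, r_desired, n_blocks=20, block=1000):
    """Metropolis sampling with the jumps re-tuned after every block of steps."""
    params, jumps = list(params), list(jumps)
    m = len(params)
    current = chi2(params)
    ratio = 0.0
    for _block in range(n_blocks):
        accepted = [0] * m
        for _step in range(block):
            i = rng.randrange(m)        # assumption: parameter picked at random
            trial = list(params)
            trial[i] += rng.uniform(-1.0, 1.0) * jumps[i]
            new = chi2(trial)
            delta = new - current
            if delta <= 0.0 or rng.random() < math.exp(-0.5 * delta):
                params, current = trial, new
                accepted[i] += 1
        for i in range(m):
            r_i = max(accepted[i], 1) / block      # floor avoids a zero jump (assumption)
            jumps[i] *= r_i / (r_desired / m)
        ratio = sum(accepted) / block
    return params, jumps, ratio, current

start = [2.0, 2.0, 2.0]
params, jumps, ratio, final_chi2 = run(start, [10.0, 10.0, 10.0], r_desired=0.66)
print("parameters:", params)
print("tuned jumps:", jumps)
print("acceptance ratio in the last block:", ratio)
```

Changing the initial jumps or `r_desired` in the last call allows the other cases described above to be explored.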




FIG. 1: Circles: Generated Gaussian function to test the algorithm with the
parameters $\{A, W, C\} = \{10, 1, 5\}$. Dashed line: starting point for all
performed tests ($\{A, W, C\} = \{2, 2, 2\}$). Solid line: fitted Gaussian
function.

FIG. 2: Total acceptance ratio $R$ as a function of the number of steps when
$R_{\mathrm{desired}}$ is set to 66% and 9% (solid and dashed or dotted lines).
In the second case ($R_{\mathrm{desired}} = 9\%$), dashed and dotted lines
represent the values of $R$ as a function of algorithm step for two different
parameter step size initializations ($\Delta P^{\mathrm{max}} = 10$ and
$\Delta P^{\mathrm{max}} = 10^{-4}$, respectively).



In figure 3 we show the three individual acceptance ratios $R_i$ for the
different parameters as a function of the fit steps for different initialization
values of the parameter jumps $\Delta P_i$, for different values of
$R_{\mathrm{desired}}$, and setting the number of steps to recalculate parameter
jumps $N$ to 1000. When the total acceptance ratio is set to
$R_{\mathrm{desired}} = 66\%$ (solid line), the algorithm is able to change all
parameter jumps (see figure 3(b)), making the acceptance ratio $R_i$ of every
parameter equal to $R_{\mathrm{desired}}/m = 22\%$ and thus the total acceptance
ratio $R$ equal to 66%. The same happens if the acceptance is set to 9%: the
algorithm finds the parameter step sizes (see dashed line in Fig. 3(b)) which
yield a total acceptance ratio of 9% within the first 5000 steps, no matter how
the parameter step sizes were initialized.

FIG. 3: (color online) a) Acceptance ratio $R_i$ for parameters $A$, $W$, $C$
involved in the fit of the Gaussian following equation (9) (red triangles, green
squares and blue circles, respectively) when $R_{\mathrm{desired}}$ is set to
66% and 9% (solid and dashed lines). b) Parameter step size as a function of the
number of steps (line and symbol code as in figure a). The inset shows a cut
through the $\chi^2$ hypersurface along the $A$ and $C$ directions, fixing $W$
to the best fit value.

FIG. 4: (color online) a) Acceptance ratio $R_i$ for parameters $A$ (triangles),
$W$ (squares), $C$ (circles) involved in the fit of the Gaussian following
equation (9) when initial parameter step sizes are set to $\Delta P_i = 10$
(dashed line) and $\Delta P_i = 10^{-4}$ (dotted line). b) Parameter step size
as a function of the number of steps (lines and symbols as in figure a).

To explicitly show how this is linked with the geometrical features of the
$\chi^2$ landscape, the inset of figure 3(b) shows a cut of the $\chi^2$
hypersurface along parameters $A$ and $C$, leaving parameter $W$ fixed to its
best fit value $W_{BF}$. As can readily be seen, the
$\chi^2\{A, C, W = W_{BF}\}$ hypersurface is sloppy in the direction of
parameter $A$ and stiff in the direction of parameter $C$. The algorithm has
thus correctly calculated a parameter step size which is larger for $A$ than for
$C$, along whose direction the $\chi^2$ well is narrower. This makes the final
parameter step sizes proportional to the errors of each parameter, provided that
the global minimum is not multimodal, $\chi^2$ is quadratic in all parameters,
and the parameters are not correlated.

In order to show the robustness of the algorithm, we have also made disparate
initial guesses for the parameter step sizes $\Delta P_i^{\mathrm{max}}$, about
three decades below the values that give the correct acceptance ratio, setting
$R_{\mathrm{desired}} = 9\%$. As displayed in figure 4, after about 5000 steps
the acceptance ratio $R$ ($N$ is again 1000 steps) has already reached the
desired value. It can be seen in figure 4(a) that the acceptance ratio for each
parameter again reaches the value $R_{\mathrm{desired}}/m = 3\%$, and parameter
step sizes are virtually equal to those obtained previously, as shown in figure
4(b).

To stress the relevance of the aforementioned algorithm to explore the parameter
space correctly, thus assuring its convergence, we have calculated the
normalized $\Delta\chi^2$ PDF in all tested cases. As can be seen in figure 5,
the $\Delta\chi^2$ PDF after 10^[?] steps matches the chi-square distribution

$$ P(\Delta\chi^2) \propto (\Delta\chi^2)^{\left(\frac{m}{2} - 1\right)} \exp\left(-\frac{\Delta\chi^2}{2}\right) \tag{10} $$

with $m = 3$, as expected [?]. In figure 5 we show the $\Delta\chi^2$ PDF
obtained after 10^[?] steps for different cases: first setting
$\Delta P_i^{\mathrm{max}}$ equal to the value calculated by the algorithm,
second setting $\Delta P_i^{\mathrm{max}}$ equal to the initial guess, and
finally setting it to a value, calculated a posteriori, which is proportional to
the best fit parameters, $\Delta P_i^{\mathrm{max}} = 0.1 P_i$ (inset of figure
5).
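Equation (10) is stated only up to a proportionality constant. The sketch below evaluates it with the standard chi-square normalisation $1/(2^{m/2}\,\Gamma(m/2))$, which is a textbook result rather than something given in this work, and checks numerically that the density integrates to one and has mean $m$. The function name and the integration grid are illustrative choices.

```python
import math

def delta_chi2_pdf(x, m):
    """Normalised chi-square density with m degrees of freedom, cf. equation (10)."""
    if x <= 0.0:
        return 0.0                      # the density is evaluated only for x > 0
    norm = 2.0 ** (m / 2.0) * math.gamma(m / 2.0)
    return x ** (m / 2.0 - 1.0) * math.exp(-0.5 * x) / norm

step = 0.001
grid = [k * step for k in range(60001)]           # 0 <= delta chi^2 <= 60
area = sum(delta_chi2_pdf(x, 3) for x in grid) * step
mean = sum(x * delta_chi2_pdf(x, 3) for x in grid) * step
print(area, mean)
```

A histogram of the $\Delta\chi^2$ values visited by a well-tuned chain can be compared directly with `delta_chi2_pdf(x, 3)` for the three-parameter fit discussed here.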

As can be seen in figure 5, when $\Delta P_i^{\mathrm{max}}$ is set much higher
than the optimal step sizes, the Metropolis algorithm scans the whole parameter
space $\{P_i\}$, but jumping between disparate regions with very different
values of $\chi^2$, and therefore with a low acceptance rate of new parameter
sets (dashed line in figure 5). This causes a poor exploration of parameter
space. In contrast, a small value over-explores only a restricted portion of
$\{P_i\}$, falling very often in local minima of the parameter space (dotted
line in the same figure). Choosing parameter jumps proportional to the final
parameters also leads to a poor exploration of parameter space (solid line in
the same figure). Finally, after the same number of steps, when using the
optimized parameter step sizes obtained by the algorithm, the $\chi^2$ PDF
follows the theoretical expectation, meaning that the parameter space is
correctly sampled.

FIG. 5: The dashed line represents a chi-square distribution for three
parameters, i.e. $m = 3$ (see text for details). The solid line is the obtained
PDF associated with $\Delta\chi^2$ when calculated for 10^[?] steps. Circles
represent the same distribution when calculated using only 10^[?] steps. The
inset shows the PDFs when calculated with parameters allowed to change with
$\Delta P_i = 10^{-4}$, $\Delta P_i = 10$ and $\Delta P_i = 0.1 P_i$. Successive
PDFs are displaced on the ordinate axis for clarity of the figure.

Fitting in a complex $\chi^2$ landscape

As pointed out before, one of the main problems when dealing with data fitting
using the LM algorithm is to find a proper set of initial parameters close
enough to the global minimum of the $\chi^2\{P_i\}$ hypersurface. As an example
we show in figure 6 the function $\sin(x/W)$ for $W = 5$, affected by a normally
distributed error with $\sigma = 0.1$. In figure 7(a) we show the $\chi^2\{W\}$
landscape associated with the generated function. As can be seen, the
$\chi^2\{W\}$ landscape for this function has a great number of local minima and
a global minimum at $W = 5$. We have fitted the function using the LM algorithm,
initializing the parameter at $W_i = 2$ and $W_i = 15$ (see figure 6). As
expected, neither fit was able to find the global minimum that fits the
function. In fact, the LM algorithm succeeds in fitting the data only if it is
initialized between $W = 3.6$ and $W = 9.0$.
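The structure of such a landscape can be inspected with a simple scan over $W$. The sketch below generates synthetic data from $\sin(x/5)$ with $\sigma = 0.1$ and counts the local minima of $\chi^2\{W\}$ on a grid. The $x$ range (0 to 100), the number of points and the scanned interval of $W$ are assumptions, so the number and position of the minima will differ from those of figure 7.

```python
import math
import random

rng = random.Random(0)
SIGMA = 0.1
xs = [0.5 * k for k in range(201)]                       # x range 0..100 (assumption)
ys = [math.sin(x / 5.0) + rng.gauss(0.0, SIGMA) for x in xs]

def chi2(w):
    return sum(((math.sin(x / w) - y) / SIGMA) ** 2 for x, y in zip(xs, ys))

ws = [1.0 + 0.01 * k for k in range(1901)]               # scan W from 1 to 20
landscape = [chi2(w) for w in ws]

# A grid point is a local minimum if it lies below both of its neighbours.
minima = [ws[k] for k in range(1, len(ws) - 1)
          if landscape[k] < landscape[k - 1] and landscape[k] < landscape[k + 1]]
best = ws[landscape.index(min(landscape))]
print(f"{len(minima)} local minima on the grid, lowest chi^2 at W = {best:.2f}")
```

Any purely downhill search started in the basin of one of the other minima ends there, which is the failure mode described above for the LM fits.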

We now test the ability of our algorithm to jump across $\chi^2$ barriers
delimiting successive local minima to find the global one. For this task we have
used the simulated annealing method, decreasing the temperature one decade every
3000 steps from $T = 1000$ to $T = 1$. The parameter jump calculation has been
performed every $N = 1000$ steps. While the initial temperature allows wide
regions of the parameter space to be explored, the last temperature lets the
acceptance be determined only by the real errors of the data.
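The annealing schedule just described is a step function of the algorithm step and can be written as a short helper. The function and argument names are illustrative.

```python
def temperature(step, t_start=1000.0, t_final=1.0, steps_per_decade=3000):
    """Temperature that drops by one decade every `steps_per_decade` steps."""
    t = t_start
    for _ in range(step // steps_per_decade):
        if t <= t_final:
            break                       # never cool below the final temperature
        t /= 10.0
    return max(t, t_final)

for step in (0, 2999, 3000, 6000, 9000, 12000):
    print(step, temperature(step))
```

The returned value is the $T$ that enters the acceptance rule of equation (7); once it reaches 1 the acceptance is governed by the true data errors.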

In figure 7(b) we show the parameter $W$ as a function of algorithm step for the
two aforementioned initializations, together with the $\chi^2$ landscape (a).
Parameter step sizes were initialized after a first optimization run of 2000
steps. As can be seen in this figure, after 3000 steps both runs have already
reached the absolute minimum. Successive steps just relax the system to the
final temperature $T = 1$.

As can be seen in figure 7, the way the minimum is reached depends on the
parameter initialization. Parameter step sizes are larger for the run started at
$W_i = 15$, which begins in a flat local minimum. The contrary happens with the
run initialized at $W_i = 2$: parameter step sizes are set small due to the
narrow wells of the $\chi^2$ landscape in this region. However, both runs are
able to avoid getting stuck in local minima, jumping over rather high $\chi^2$
barriers and successfully reaching the best fit.

FIG. 6: (color online) Synthetic $\sin(x/5)$ function (circles) together with
the best fit using parameter step size tuning together with simulated annealing
(line). Dashed lines are the fits using the LM algorithm with starting
parameters $W_i = 2$ and $W_i = 15$.



CONCLUSION 

Classical fit schemes are known to fail when the parameters are not initialized
close enough to the final solution. We have proposed in this work to use an
Adaptive Markov Chain Monte Carlo Through Regeneration scheme, adapted from that
of Gilks et al. [15], combined with a simulated annealing procedure to avoid
this problem.

The proposed algorithm tunes the parameter step sizes in order to assure that
all parameters are accepted in the same proportion. Geometrically, the parameter
step size is set large when a cut of $\chi^2\{P_i\}$ along this parameter is
flat, i.e. when the $\chi^2\{P_i\}$ hypersurface is sloppy along this parameter.
Similarly, the parameter step size is set small if the $\chi^2\{P_i\}$ wells are
narrow.

Moreover, the step sizes can be modulated by a temperature added to the
acceptance equation that makes jumps across $\chi^2$ barriers easier, i.e. using
a simulated annealing method. From a geometric point of view, a high temperature
makes the $\chi^2\{P_i\}$ wells artificially broader, smearing out details of
local minima. This is important at the first stages of a fit process. At the
final stages of the fitting, the temperature is decreased, making parameter
jumps smaller, and thus allowing the system to relax once it is inside the
global minimum.

FIG. 7: (color online) (a) $\chi^2\{W\}$ landscape obtained for the function
$\sin(x/W)$ with an associated normal error of $\sigma = 0.1$ (see figure 6).
(b) Algorithm steps for two different initializations, black solid line for
$W_i = 2$ and red dashed line for $W_i = 15$, as a function of parameter $W$.

By fitting simulated data including statistical errors, we verified that our
algorithm fulfills the requirements of ergodicity (it converges to the target
distribution), robustness (the ability to reach the $\chi^2$ minimum
independently of the choice of starting parameters), the ability to escape local
minima and to explore the $\chi^2$ landscape efficiently, and self-tuning, so
that it converges to the global minimum and avoids an infinite search with large
steps.

More complex problems have already been studied successfully with this
algorithm, such as model selection using Quasielastic Neutron Scattering data,
non-functional fits in the case of dielectric spectroscopy [?], or finding the
molecular structure from diffraction data with a model defined by as many as 27
parameters [23]. In the last case, the proper initialization of parameters to
use an LM algorithm would have been a difficult task, made easy by the use of
the presented algorithm.